Placing Tracks in Context

My tracks in the class corpus

This graph plots the engagingness of each song against its danceability, with the colour scale showing the tempo of each song. The first noticeable feature is the seemingly positive correlation between danceability and engagingness, shown by the red trend line: in most cases, a song with a high danceability value also has a high engagingness rating. The colour scale also suggests that most songs scoring high on both features have a higher tempo. This could mean that these features are genuinely correlated, or that Essentia derives them from similar underlying computations. It would be interesting to examine why this correlation appears, for instance by looking at the role of instrumentalness or genre in combination with this analysis.

The two highlighted points are a song I arranged myself and one I generated with Suno. My own song scores higher in both danceability and engagingness than the AI song, even though they are the same genre and were made with the same intention. In addition, the BPM of the AI track is analysed accurately while the BPM of my own song is not, which suggests another question to ask of the data: can AI analyse AI songs better than human-made songs?

filename approachability arousal danceability engagingness instrumentalness tempo valence ai
ahram-j-1 0.2991498 3.417260 0.2711799 0.1026429 0.9141049 84 4.016967 TRUE
ahram-j-2 0.1889460 4.459196 0.4690239 0.5624804 0.3271964 95 3.767471 TRUE
aleksandra-b-1 0.1644350 5.343031 0.8357580 0.5665221 0.3702452 68 4.738314 FALSE
aleksandra-b-2 0.2511401 3.680455 0.6918470 0.1301249 0.8842366 104 4.044941 TRUE
angelo-w-1 0.1614367 3.621579 0.7069914 0.3248783 0.7907066 140 3.301473 FALSE
id approachability arousal danceability engagingness instrumentalness tempo valence ai
berend-b-1 0.1450785 5.021568 0.7396224 0.5278043 0.5858963 143 4.429538 TRUE
berend-b-2 0.2117881 5.656832 0.6107739 0.5786535 0.3487158 75 4.476577 TRUE
desmond-l-1 0.2629817 4.478108 0.2859525 0.4156072 0.6434987 135 3.936315 TRUE
desmond-l-2 0.2929443 5.076702 0.3010519 0.5524329 0.4989389 73 4.316221 TRUE
evan-l-2 0.1081999 5.602334 0.4800247 0.6272448 0.5513844 135 4.445124 TRUE
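
The red trend line in the scatter plot summarises a linear relationship between danceability and engagingness. As a rough illustration of how that relationship can be quantified, the sketch below computes a Pearson correlation coefficient over the ten preview rows shown above. It is written in plain Python as an assumption for illustration and is not the code behind the plot; ten rows are also far too few to stand in for the whole class corpus, so the result should not be read as the corpus-wide correlation.

```python
import math

# Danceability and engagingness copied from the ten preview rows above.
danceability = [0.2711799, 0.4690239, 0.8357580, 0.6918470, 0.7069914,
                0.7396224, 0.6107739, 0.2859525, 0.3010519, 0.4800247]
engagingness = [0.1026429, 0.5624804, 0.5665221, 0.1301249, 0.3248783,
                0.5278043, 0.5786535, 0.4156072, 0.5524329, 0.6272448]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of the two features around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson(danceability, engagingness)
print(f"r = {r:.3f} over {len(danceability)} tracks")
```

A value near +1 would indicate a strong positive linear relationship, a value near 0 little or none.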

Information on my submitted tracks

Hidde-s-1:

I produced this song myself. I make music with clubs and festivals in mind, as I like to DJ. For this track I tried to combine a mainstream house sound with some rawer electronic sounds.

Hidde-s-2:

This is a track I generated with Suno. I asked ChatGPT what the key characteristics of a dance track in a sweaty club in Amsterdam are:

“Punchy four-on-the-floor kick, deep rolling bass, crisp shuffled hi-hats, sharp claps, detuned wide synth leads, tension-filled breakdown, rising FX, massive sidechained drop, high-energy, club-focused groove.”

AI and human-made music are indistinguishable through clustering

Clustering is a technique … Here you can see that there is no distinguishable difference between the AI tracks and the human tracks, at least not through clustering based on this dataset. [INSERT REASON:]

AI might produce less interesting melodies than humans

chromagrams

AI Chromagrams

Chord progressions in AI and human-made music are indistinguishable

Human Made

AI GENERATED

Tempo is an uninteresting factor

energy novelty:

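Here you’ll find an energy novelty function. It measures how sharply the energy (loudness) of the signal rises from one moment to the next, so its peaks mark likely onsets such as drum hits, as can be seen here.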

energy novelty function
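
To make the idea of an energy novelty function concrete, here is a minimal sketch in plain Python. It assumes one common formulation: frame-wise energy, log compression, then the half-wave rectified difference between neighbouring frames. The frame length, hop size, compression constant `gamma` and the synthetic test signal are all assumptions chosen for illustration; this is not the code that produced the plot above.

```python
import math


def energy_novelty(samples, frame_length=256, hop=128, gamma=10.0):
    """Half-wave rectified difference of log-compressed frame energy."""
    # Mean squared amplitude of each (overlapping) frame.
    energies = [
        sum(s * s for s in samples[start:start + frame_length]) / frame_length
        for start in range(0, len(samples) - frame_length + 1, hop)
    ]
    # Log compression makes quiet onsets more visible next to loud ones.
    log_energy = [math.log(1.0 + gamma * e) for e in energies]
    # Keep only increases in energy; decreases are clipped to zero.
    return [max(0.0, b - a) for a, b in zip(log_energy, log_energy[1:])]


# Synthetic signal (an assumption, not corpus audio): silence, then a sine tone.
silence = [0.0] * 1024
tone = [math.sin(2 * math.pi * 8 * i / 256) for i in range(1024)]
novelty = energy_novelty(silence + tone)

# The largest novelty value sits where the tone starts.
peak = max(range(len(novelty)), key=novelty.__getitem__)
print("peak at novelty index", peak)
```

Because only increases are kept, the curve reacts to the start of a sound but stays near zero while the sound is sustained or fading.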

Spectral novelty text here:

spectral novelty function

The tempograms below are based on this novelty function. As you can see, there is no clear difference between the tracks, which is to be expected since dance music is usually made with a drum machine that keeps a fixed tempo.

tempograms
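
A tempogram looks for periodicity in a novelty curve over time. The sketch below shows the core step for a single window only: it autocorrelates a novelty curve and picks the lag with the strongest self-similarity, then converts that lag to BPM. The function name `estimate_tempo`, the BPM search range and the synthetic pulse train are assumptions for illustration, not the method used to draw the tempograms above.

```python
def estimate_tempo(novelty, frame_rate, bpm_min=70.0, bpm_max=200.0):
    """Pick the tempo whose beat period maximises the novelty autocorrelation."""
    n = len(novelty)
    mean = sum(novelty) / n
    centred = [v - mean for v in novelty]
    # Convert the BPM search range into a range of lags (in frames).
    lag_min = max(1, round(60.0 * frame_rate / bpm_max))
    lag_max = min(n - 1, round(60.0 * frame_rate / bpm_min))
    best_bpm, best_score = None, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Average product of the curve with a copy of itself shifted by `lag`.
        score = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / (n - lag)
        if score > best_score:
            best_bpm, best_score = 60.0 * frame_rate / lag, score
    return best_bpm


# Synthetic novelty curve (an assumption): one pulse every 50 frames at
# 100 frames per second, i.e. a drum-machine-like pulse at 120 BPM.
frame_rate = 100.0
novelty = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
print(estimate_tempo(novelty, frame_rate))
```

A perfectly regular pulse is equally self-similar at one beat, two beats, and so on, which is why the search range matters. This ambiguity is one possible reason an estimated BPM can come out at half or double the tempo a listener would tap.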

Is AI producing differently?

Cepstrograms

Above you can see cepstrograms

Structuring

Above you can see self-similarity matrices (SSMs).

Can we classify AI-made songs?

# A tibble: 2 × 3
  class  precision recall
  <fct>      <dbl>  <dbl>
1 AI         0.385  0.357
2 Non-AI     0.437  0.467

above you can see

# A tibble: 2 × 3
  class  precision recall
  <fct>      <dbl>  <dbl>
1 AI         0.75   0.643
2 Non-AI     0.706  0.8
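
For readers unfamiliar with the two columns: precision is the share of tracks predicted as a class that really belong to it, and recall is the share of tracks in a class that the classifier found. The sketch below computes both from confusion counts. The counts are hypothetical: they were chosen only because they reproduce the rounded values in the second table, and the real confusion matrix is not shown here.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall for one class from its confusion counts."""
    precision = tp / (tp + fp)  # of everything predicted as the class, how much was right
    recall = tp / (tp + fn)     # of everything truly in the class, how much was found
    return precision, recall


# Hypothetical counts (an assumption, not taken from the actual model output).
ai_as_ai, ai_as_human = 9, 5          # true AI tracks: correctly / wrongly labelled
human_as_human, human_as_ai = 12, 3   # true non-AI tracks: correctly / wrongly labelled

ai_precision, ai_recall = precision_recall(
    tp=ai_as_ai, fp=human_as_ai, fn=ai_as_human)
human_precision, human_recall = precision_recall(
    tp=human_as_human, fp=ai_as_human, fn=human_as_ai)

print(f"AI      precision={ai_precision:.3f} recall={ai_recall:.3f}")
print(f"Non-AI  precision={human_precision:.3f} recall={human_recall:.3f}")
```

Note that a track wrongly labelled as AI lowers AI precision and non-AI recall at the same time, which is why the two rows of each table move together.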